Automatic Linguistic Indexing of Pictures by a Statistical Modeling Approach

Authors

  • Jia Li
  • James Ze Wang
Abstract

Automatic linguistic indexing of pictures is an important but highly challenging problem for researchers in computer vision and content-based image retrieval. In this paper, we introduce a statistical modeling approach to this problem. Categorized images are used to train a dictionary of hundreds of statistical models, each representing a concept. Images of any given concept are regarded as instances of a stochastic process that characterizes the concept. To measure the extent of association between an image and the textual description of a concept, the likelihood of the occurrence of the image based on the characterizing stochastic process is computed. A high likelihood indicates a strong association. In our experimental implementation, we focus on a particular group of stochastic processes, that is, the two-dimensional multiresolution hidden Markov models (2-D MHMMs). We implemented and tested our ALIP (Automatic Linguistic Indexing of Pictures) system on a photographic image database of 600 different concepts, each with about 40 training images. The system is evaluated quantitatively using more than 4,600 images outside the training database and compared with a random annotation scheme. Experiments have demonstrated the good accuracy of the system and its high potential in linguistic indexing of photographic images.

Index Terms: Content-based image retrieval, image classification, hidden Markov model, computer vision, statistical learning, wavelets.
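The likelihood-based association described in the abstract can be illustrated with a minimal sketch. It is not the ALIP implementation: a diagonal Gaussian per concept stands in for the 2-D MHMM, and the feature vectors, concept names, and function names (fit_gaussian, log_likelihood, train_dictionary, rank_concepts) are all hypothetical. It only shows the overall flow: train one model per concept, score a new image under every model, and rank concepts by likelihood.

```python
import math

def fit_gaussian(vectors):
    """Fit a diagonal Gaussian (stand-in for a concept model, NOT a 2-D MHMM)."""
    n = len(vectors)
    dim = len(vectors[0])
    means = [sum(v[d] for v in vectors) / n for d in range(dim)]
    variances = [
        # Floor the variance so a constant feature cannot cause division by zero.
        max(sum((v[d] - means[d]) ** 2 for v in vectors) / n, 1e-6)
        for d in range(dim)
    ]
    return means, variances

def log_likelihood(model, x):
    """Log-likelihood of feature vector x under a diagonal Gaussian model."""
    means, variances = model
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (xi - mu) ** 2 / var)
        for xi, mu, var in zip(x, means, variances)
    )

def train_dictionary(training_sets):
    """Train one model per concept from its categorized training images."""
    return {concept: fit_gaussian(vecs) for concept, vecs in training_sets.items()}

def rank_concepts(dictionary, x):
    """Return concepts sorted by decreasing likelihood of the image features x."""
    scores = {c: log_likelihood(m, x) for c, m in dictionary.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical toy features; real systems would use wavelet-based features.
training_sets = {
    "beach": [[0.9, 0.1], [0.8, 0.2], [1.0, 0.15]],
    "forest": [[0.1, 0.9], [0.2, 0.8], [0.15, 1.0]],
}
dictionary = train_dictionary(training_sets)
ranking = rank_concepts(dictionary, [0.85, 0.12])
print(ranking)
```

In this toy setup the query vector lies close to the "beach" training vectors, so "beach" is ranked first. A higher likelihood is read as a stronger association between the image and the concept's description.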


Similar resources

Mining Digital Imagery Data for Automatic Linguistic Indexing of Pictures

In this paper, we present a new research direction, automatic linguistic indexing of pictures, for data mining and machine learning researchers. Automatic linguistic indexing of pictures is an imperative but highly challenging problem. In our on-going research, we introduce a statistical modeling approach to this problem. Computer algorithms have been developed to mine numerical features automa...


ALIP: The Automatic Linguistic Indexing of Pictures System

In this demonstration, we present the Automatic Linguistic Indexing of Pictures (ALIP) system. The system annotates images with linguistic terms, chosen among hundreds of such terms. The system uses a wavelet-based approach for feature extraction, a statistical modeling process for training, and a statistical significance processor to annotate images. We implemented and tested our ALIP system o...


Evaluation strategies for automatic linguistic indexing of pictures

With the rapid technological advances in machine learning and data mining, it is now possible to train computers with hundreds of semantic concepts for the purpose of annotating images automatically using keywords and textual descriptions. We have developed a system, the Automatic Linguistic Indexing of Pictures (ALIP) system, using a 2D multiresolution hidden Markov model. The evaluation of su...


Automatic keyword extraction using Latent Dirichlet Allocation topic modeling: Similarity with golden standard and users' evaluation

Purpose: This study investigates the automatic keyword extraction from the table of contents of Persian e-books in the field of science using LDA topic modeling, evaluating their similarity with golden standard, and users' viewpoints of the model keywords. Methodology: This is a mixed text-mining research in which LDA topic modeling is used to extract keywords from the table of contents of sci...


A Two-Stage Split-and-Select Model for Automatic Indexing of Persian Texts

Purpose: Each language has its own problems, which calls for automatic indexing models suited to each language. These models should address the exhaustivity and specificity of indexing. This paper aims to introduce and evaluate a model suited to Persian automatic indexing. This model suggests to break the text into the particles of candidate terms and to c...



Journal
  • IEEE Trans. Pattern Anal. Mach. Intell.

Volume 25

Publication date: 2003